Articles & Research Notes

Design False: The Structural Dismantling of exFAT Mapping Logic (CVE-2025-22036)

This analysis deliberately eschews the provision of security advisories, mitigations, or remediation patches. The objective is strictly analytical to dismantle the fundamental design assumptions within the Linux exFAT subsystem that facilitated CVE-2025-22036. By applying a rigorous offensive framework, this study interrogates the failure of the Ownership Module and the collapse of the Thread Model under specific race-condition primitives.

The research follows the formalized methodologies established at: https://bytrep.com/METHODOLOGIES.html

1. Environmental Assumptions

The exFAT design hypothesis, as codified in the MS-EXFAT specification (Section 8.1), externalizes its consistency requirements to the host environment, introducing several critical hidden assumptions:

Assumption of Sequential Determinism (Execution Order):

By defining a "Recommended Write Ordering" without mandated hardware barriers, the design implicitly assumes that the execution environment preserves the logical sequence of metadata updates. IAEM extracts this as an unstated reliance on the absence of adversarial instruction reordering by the compiler or CPU.

Assumption of Scheduling Benignity:

The multi-step update protocol (8.1, Steps 1-5) assumes that the scheduler will not interleave a second execution thread during the transitional state between VolumeDirty=1 and VolumeDirty=0. This represents an implicit dependency on Temporal Isolation, assuming the environment is non-hostile during metadata mutation windows.

Assumption of Cooperative Compliance (Layer Trust):

Section 8.2 enforces behavioral restrictions on "unrecognized entries" through normative language rather than structural invariants. The design assumes that all system components will voluntarily respect ownership boundaries, failing to provide hard enforcement against a compromised or adversarial implementation.

2. Concurrency & Interleaving Assumptions

The exFAT implementation within the Linux Kernel reveals a profound reliance on Implicit Sequentiality. Through IAEM, we extract the following hidden dependencies:

Implicit Sequentiality of Buffer-Head Operations:

The design assumes that the logical flow from offset calculation to block attachment is an indivisible unit of execution. It fails to declare a structural invariant that prevents state mutation (e.g., via mpage_read_folio) between these two points. The hypothesis survives only under the assumption that the environment guarantees Atomicity by Coincidence.

Assumption of Thread Cohesion (Global State Visibility):

There is an unstated assumption that all CPU cores maintain a perfectly synchronized view of the inode metadata without explicit memory barriers (smp_mb()). The design implicitly relies on the hardware's cache coherency protocols to substitute for software-level synchronization invariants.

Externalized Serialization Dependency:

The design assumes the existence of an external Serialization Point (likely at the VFS layer). By not implementing internal reentrancy checks or state verification within exfat_get_block, the design implicitly claims that illegal concurrent states are "unreachable by convention" rather than "impossible by design."

3. State Space Assumptions

The exFAT design hypothesis fails to bound the system's reachable state space, relying instead on the perceived impossibility of transitional corruption. Through IAEM, we expose:

Assumption of Structural Infallibility:

The design assumes that objects within the stack frame (specifically Buffer Heads) are immune to external mutation during the execution of I/O primitives like end_buffer_read_sync. It treats the existence of a valid bh as a Design Invariant when it is, in fact, an unprotected Temporal Variable.

Presumption of Atomic State Transitions:

There is an implicit assumption that the system cannot dwell in an "Intermediate State" where a buffer is partially mapped or pending unlock while its underlying folio is invalidated. The design assumes these states are Unreachable without Proof, effectively gambling on the speed of execution rather than the strength of synchronization.

The Fallback Validity Assumption:

By providing fallback mechanisms (unlock_buffer) without re-verifying object ownership or integrity, the design assumes that the failure domain is always contained. It assumes that even in a failure state, the structural integrity of the memory object remains intact—a classic Implicit Trust in the persistence of the failure context.

4. Ownership & Authority Assumptions

The exFAT design hypothesis fails to establish a rigorous Ownership Module, relying on a fragmented and implicit authority structure that collapses under concurrency.

The Fallacy of Stack-Local Authority:

The design assumes that placing a Buffer Head (bh) reference on a thread's stack frame implicitly secures its ownership. IAEM extracts a hidden assumption of Exclusive Local Scope, which ignores the reality of the Linux Buffer Cache where the bh remains a globally reachable and mutable resource.

Undefined State Mutation Authority:

There is no formalized "Write/Read State Model" governing the interaction between the I/O subsystem and the exFAT API. The design assumes Mutual Non-Interference, meaning it trusts that no other core or component will assert authority over a buffer while it is pending in wait_on_buffer.

Ownership Anarchy during Fallback Execution:

By delegating cleanup to generic fallback paths like unlock_buffer without verifying structural integrity, the design assumes that Authority is Persistent. It fails to account for "Ownership Theft" (Cross-thread violation), where the right to mutate the object is exercised by an adversarial thread before the legitimate owner can finalize the state.

5. Lifetime Assumptions

The exFAT implementation operates under precarious and undeclared lifecycle dependencies, failing to synchronize the object's physical existence with its logical validity.

The Persistence Illusion:

The design assumes that an object's lifetime is inherently tied to the execution flow of the calling thread. IAEM extracts a hidden assumption of Automatic Structural Persistence, where the system trusts that a Buffer Head cannot be reclaimed as long as it is referenced within a stack frame. This ignores the asynchronous nature of memory reclamation in the Linux Kernel.

Assumption of Invisible Invalidation:

There is an implicit assumption that no "Adversarial Reclamation" can occur during the window between submit_bh and unlock_buffer. The design fails to enforce an invariant that mandates "Use must precede destruction" across all cores, assuming instead that the memory management subsystem will cooperate to keep objects alive regardless of concurrent state-changing operations like truncate.

The Lifecycle-State Disconnect:

The hypothesis assumes that the transition to an "Invalid State" (e.g., folio truncation) is mutually exclusive with active I/O processing. By not explicitly declaring "No access during invalid states" as a structural enforcement, the design relies on the Environmental Assumption that the lifecycle of metadata is globally serialized, which is demonstrably false in high-concurrency scenarios.

6. Trust Assumptions

The exFAT design hypothesis is built upon a foundation of Blind Transitive Trust, outsourcing its security invariants to external kernel subsystems and ignoring the potential for adversarial interference at the API boundaries.

Absolute Dependency on Lower-Layer Infallibility:

The design assumes that the Block I/O layer is a closed, trusted circuit. IAEM extracts a hidden assumption that once a buffer is submitted via submit_bh, its integrity is guaranteed until it is unlocked. It fails to account for "Man-in-the-Middle" state mutations occurring within the kernel's own synchronization primitives.

The Sanctity of the VFS-to-FS Contract:

There is an implicit trust that the VFS layer maintains a perfect, immutable state of file metadata (inode fields) for the duration of a get_block call. The design assumes that the caller provides a "Sanitized Environment," failing to implement local verification of file boundaries during the critical window of block mapping.

Trust in Environmental Non-Adversarialism:

The hypothesis assumes that the Linux Memory Management (MM) subsystem acts as a cooperative partner that will never reclaim or invalidate resources in a way that contradicts exFAT's internal logic. This represents an unstated Trust Invariant in the stability of the global state, treating the host kernel as a benign entity rather than a complex, potentially hostile scheduling environment.

7. Boundary & Scope Assumptions

The exFAT design suffers from Scope Myopia, incorrectly assuming that the operational boundaries of a simple file system will remain intact within the chaotic environment of a modern monolithic kernel.

The Illusion of Contextual Isolation:

The design assumes a "Zero-Contention" boundary, where it implicitly relies on the absence of high-frequency parallel execution against the same inode. IAEM exposes this as a failure to define a Concurrent Scope, assuming instead that the system operates in a serialized vacuum.

Unstated Local-Only Assumptions:

By failing to implement cross-core state validation, the hypothesis assumes that the "Global State" is merely a collection of "Local States." It implicitly assumes that no external actor can bridge the gap between different execution scopes (e.g., a concurrent truncate thread and a mapping thread), treating these as disjoint events.

Assumption of Environmental Benevolence:

The design scope is limited to "Normal Operating Conditions," assuming the absence of adversarial timing or resource pressure. IAEM formalizes the unstated assumption that the system does not need to defend its boundaries against hostile interleaving, as it treats such scenarios as "Out-of-Scope" despite their physical reachability in the Linux Kernel.

8. Resource Stability Assumptions

The exFAT design operates under a "Best-Case Scenario" fallacy, assuming that the stability of underlying system resources is a guaranteed constant rather than a volatile variable.

The Assumption of Immutable Allocation:

Once a resource (e.g., Buffer Head) is acquired, the design assumes its structural integrity is immutable for the duration of the local stack frame. IAEM extracts a hidden dependency on External Resource Locking, assuming that the memory management subsystem will never revoke or invalidate a resource that is currently "in-flight" within the exFAT logic.

Implicit Reliance on Deterministic Latency:

The hypothesis assumes that dependent services (Block I/O) will respond within a predictable timeframe. It implicitly treats Execution Speed as a substitute for Formal Synchronization, assuming that the resource will remain stable because "the window for failure is too small"—a dangerous assumption of stability through probability rather than design.

Failure Domain Blindness:

The design assumes that any resource-related failure will be catastrophic and immediate (Stop-and-Fail), rather than subtle or transitional. It fails to declare invariants for "Partial Resource Success," assuming instead that the environment will either provide a perfectly stable resource or a clear error, leaving no room for the Adversarial State where a resource is partially reclaimed or inconsistent.

9. Extraction Procedure: The Exposed Hypothesis

I. The Formalized List of Extracted Assumptions

  • Environmental: Reliance on Monotonic Execution and the assumed absence of physical instruction reordering by the host environment.
  • Concurrency: Dependency on Coincidental Atomicity, assuming that adversarial scheduling will not interleave during critical I/O windows.
  • State Space: The presumption that Intermediate/Transitional States are unreachable, treating the absence of structural proof as a guarantee of state integrity.
  • Ownership & Authority: An implicit claim of Transient Exclusivity over stack-allocated objects (bh), ignoring the globally mutable nature of the Buffer Cache.
  • Lifetime: The unstated requirement that Stack-Frame Persistence equals logical validity, assuming the absence of external "Adversarial Reclamation" (e.g., via concurrent truncation).
  • Trust: A reliance on Transitive Trust, delegating the responsibility for metadata integrity entirely to the VFS and Block layers without local verification.

II. The Final "Exposed" Hypothesis

The exfat_get_block design hypothesis operates under the unstated requirement that the host environment (Linux Kernel) provides inherent, non-adversarial synchronization and perfect ownership stability. It assumes that legal states are maintained through logical sequence alone, failing to implement internal structural invariants to defend against cross-thread state violations or resource invalidation during asynchronous execution.

IFHM Terminology Binding

Each technical term must be bound to an operational reality. According to the IFHM protocol, any definition that cannot be directly converted into a verifiable invariant is rejected, potentially triggering a Design False conclusion. The following operational definitions are established:
Atomic Buffer Window (ABW): The execution span between the invocation of submit_bh() and the immediate instruction following wait_on_buffer(). It is operationally defined by the continuous retention of the BH_Lock state.
Object Anchor (OA): The hardware-level pointer consistency between a buffer_head (bh) and its associated folio/page. It must remain a non-null, immutable link during I/O processing to prevent structural decoupling.
Reference Authority (RA): The absolute value of the buffer's reference counter (atomic_read(&bh->b_count)). It is the local mechanism used to prevent the Memory Management (MM) layer from reclaiming the object during a context switch.
State Coherency (SC): The synchronization status between the cached file metadata (inode->i_size) and the physical block mapping. It requires post-sleep re-validation to ensure the mapping remains legal relative to the current file boundaries.

Each of these definitions is Operational (measurable via kernel primitives), Local (confined to the buffer_head and thread scope), Non-temporal (state-based rather than time-based), and Non-probabilistic (binary state: valid or invalid). If the design cannot enforce the Object Anchor (OA) integrity against external kernel subsystems (like truncate or reclamation), the term is rejected, and the design is declared False.

Problem Delimitation

The problem addressed by this hypothesis is strictly defined as the legal possibility of an Invariant Violation within the Linux Kernel's concurrent execution environment. According to the IFHM protocol, behavioral descriptions are disregarded in favor of structural contradictions.

1. Targeted Problem:

The fundamental problem is the Structural Decoupling between the buffer_head (bh) and its parent folio. Specifically, the design allows for a state where the Object Anchor (OA) Invariant (defined in Chapter 1) is violated. This occurs when an external thread (e.g., truncate or reclaim) invalidates the resource while the exfat_get_block thread is in a sleep state within wait_on_buffer().

2. Scope:

The scope is limited to the Post-Sleep Re-entry Phase. The delimitation focuses on the window of time after the thread wakes up but before it completes the metadata attachment. This is a purely Local and Operational scope within the Linux Kernel's block-mapping logic.

3. Non-Addressed Elements:

This hypothesis does not address general performance overhead, standard I/O errors (E/O), or hardware-level corruption. It strictly ignores any "probabilistic" safety (i.e., "this race is rare") and focuses exclusively on the Legal Reachability of an invalid state.

4. Formal Problem Statement:

The problem is defined as: "The existence of a code path in exfat_get_block where the Reference Authority (RA) and Object Anchor (OA) can be nullified by the Memory Management (MM) layer without triggering a local fault-handler or a re-validation routine, thereby allowing an Invariant Violation by design."

Invariant Declaration

Based on the Exposed Hypothesis (Original + Hidden Assumptions), we declare the following formal invariants. These represent the structural boundaries that the design MUST satisfy to be considered valid. Any dependency on timing, probabilistic safety, or "should not happen" scenarios is strictly rejected.

The Reference Stability Invariant (RSI):
Definition: The atomic_read(&bh->b_count) must be >0 at all entry, sleep, and exit points within the mapping routine.
Criteria Check: This is Explicit (mapped to b_count), Local (scoped to the buffer), and Non-probabilistic (either it is referenced or it is not). Legality: Legally unbreakable only if the local API increments the reference before any potential context switch.
The Anchor Integrity Invariant (AII):
Definition: The pointer bh->b_page (or folio link) must remain immutable and non-null during the entire execution of exfat_get_block().
Criteria Check: This is Local and Non-temporal. It does not rely on "when" the page is accessed, but on the "fact" of its attachment. Legality: This is the "Killing Point." If the Kernel's MM layer can decouple the page while the thread is sleeping, this invariant is Legally Breakable, triggering a design failure.
The Boundary Synchronization Invariant (BSI):
Definition: The calculated block mapping must strictly satisfy the condition: Offset+Length≤i_size.
Criteria Check: This is Explicit and Non-temporal. It is a mathematical boundary check. Legality: Must be re-validated post-sleep. If the design assumes i_size cannot change without a local lock, the invariant is Invalid as it relies on Context/Ordering.
The Lock-State Invariant (LSI):
Definition: The BH_Lock bit must be the exclusive authority for state mutation. No execution path may modify the buffer's content if buffer_locked(bh) is true for another thread.
Criteria Check: Explicit and Operational. Invalidity Clause: If the design "hopes" that other threads will respect the lock without a formal enforcement mechanism in all paths, this invariant is Rejected.

Hypothesis Commitment

The hypothesis for the Exposed exFAT Mapping Logic is strictly defined as a set of formal commitments to preserve the invariants declared in Chapter 3. Any safety property not explicitly committed to here is considered legally absent from the design.

1. The Commitment Set:

The design hypothesis commits solely to the preservation of:

  • The Reference Stability Invariant (RSI): A commitment to ensure the buffer's reference count remains >0 during local execution.
  • The Lock-State Invariant (LSI): A commitment to respect the BH_Lock bit as the primary arbiter of state change.
2. The Absence of Structural Commitment (The Gap):

Since the Anchor Integrity Invariant (AII) and the Boundary Synchronization Invariant (BSI) were found to rely on "context" and "timing" (making them invalid under IFHM rules), the hypothesis does not commit to preserving the link between the buffer_head and its folio post-sleep, nor does it commit to the absolute synchronization of i_size against concurrent truncation.

3. Final Hypothesis Formulation:

The exFAT block-mapping design commits to maintaining local object references and lock-states, but explicitly acknowledges the absence of structural commitments regarding cross-layer object anchoring or global metadata coherency. The design relies on the assumption that external kernel subsystems will not exercise their legal right to invalidate resources during the Atomic Buffer Window (ABW).

Logical Construction

This chapter evaluates the logical transitions within the hypothesis. Under IFHM, every transition must function as an explicit implication that propagates at least one declared invariant. Any transition relying on heuristics or the absence of interference is rejected.

1. The Inter-Core Transition Gap:

The transition from the Pre-sleep State (Initial mapping) to the Post-sleep State (I/O completion) fails the logical propagation test. Specifically:

  • Transition: wait_on_buffer(bh) → Context Re-entry.
  • Invariant Propagation: NONE. There is no invariant (neither RSI nor LSI) that legally prevents an external core from decoupling the Object Anchor (OA) during this transition.
2. Rejection of Heuristic Reasoning:

The design's reliance on the "unlikelihood" of a race condition between submit_bh and unlock_buffer is classified as Probabilistic Reasoning. Under IFHM Rule 3, this reasoning is Forbidden. The lack of a mandatory, hardware-level, or software-locked implication that preserves the buffer_head structural integrity across context switches means the transition is Logically Disconnected.

3. Invariant Failure in Fallback Paths:

The fallback logic for struct bh does not carry an Ownership Invariant from the thread module to the kernel's memory management module. Consequently, the transition between these modules results in an Invariant Void.

Final Verdict: Logical Disintegration

Because the hypothesis fails to propagate a legally unbreakable invariant across the context-switch boundary, the logical chain is broken. The transition from 'Requested Block' to 'Assigned Block' is not an explicit implication but a Heuristic Hope. Therefore, the logical construction is REJECTED.

Design Failure & Vulnerability Classification

The hypothesis failed to complete the IFHM hardening process. The structural collapse occurred during the Logical Construction phase, specifically at the inter-core transition boundary. The design lacks a verifiable invariant to bridge the state gap between the pre-sleep mapping and the post-sleep re-entry, representing a fundamental breakdown in the Thread/Ownership Module.

Furthermore, the reliance on an unprotected refcount during the fallback path confirms that the system operates in an 'Invariant Void' during high-concurrency scenarios. Consequently, this vulnerability is not classified as a traditional stack corruption; rather, it is a Structural Ownership Confusion and a Cross-Thread State Violation. The design's inability to enforce an 'Object Anchor' across execution contexts renders the mapping logic inherently unsafe.

The structural collapse we identified isn't just a theoretical anomaly; it is a physical reality within the Linux Kernel's handling of exFAT. The Ownership Confusion we exposed serves as the fundamental engine for exploitation. For a technical demonstration and the proof-of-concept (POC) validating this design failure, visit: bytrep.com/exfat.html

Conclusion

I approached this research with the hope of finding a robust structural foundation for the exFAT mapping hypothesis. Unfortunately, the design proved to be practically indefensible. It failed to maintain integrity at every critical checkpoint: from its primitive definitions and unstable constants to its logically disconnected transitions between execution cores.

When a design relies on the 'silence' of the kernel rather than the 'strength' of its own invariants, it ceases to be a secure architecture and becomes a liability. The hypothesis did not merely fail; it dissolved under the rigor of the IFHM framework. Consequently, the exFAT block-mapping logic is officially declared Design False.

Written by the Security Research Team – Bytrep, March 2026